Last updated: 2021-12-07

Checks: 6 passed, 1 warning

Knit directory: mapme.protectedareas/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20210305) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version d332a1b. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rproj.user/

Unstaged changes:
    Modified:   analysis/descriptives.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/descriptives.Rmd) and HTML (docs/descriptives.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File  Version  Author       Date        Message
Rmd   d332a1b  Yota Eilers  2021-12-07  Updates and additional plots
html  d332a1b  Yota Eilers  2021-12-07  Updates and additional plots
Rmd   1163f70  Yota Eilers  2021-11-25  Initialize descriptives.Rmd file + inserted from old Rmd file
html  1163f70  Yota Eilers  2021-11-25  Initialize descriptives.Rmd file + inserted from old Rmd file

Introduction

Relative forest cover – exploratory

All observations

Outliers excluded

The following table displays the two outliers that will be excluded in the subsequent analyses:

wdpa_pid  NAME                                           ISO3  AREA_KM2
31758     Área De Proteção Ambiental Serra Da Tabatinga  BRA   417.9696
30628     Río Indio Maíz                                 NIC   3161.2738

With the outliers removed, the y-axis can be limited to roughly 0.7–1.3. That way, we can see the dynamics of most projects in more detail:

From this point onwards, all plots exclude the aforementioned outliers.

Grouped by countries

Grouped by accessibility

A first look at accessibility:

[1]  1881  1998 65535 65535 65535 65535

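One PA appears to have an invalid travel time: 65,535 min (= 1,092.25 h). Because 65,535 is the largest value an unsigned 16-bit integer can hold, it is probably a fill value rather than a measured travel time. This PA is located on an island. For now, we drop the corresponding observations (one PA, 5 ‘projects’):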
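A minimal sketch of how such a value could be spotted and dropped, assuming that 65,535 (the largest unsigned 16-bit integer) marks invalid entries. The toy data and the names invalid_value and wdpa_accessibility_clean are illustrative; only the data frame and column names are taken from the code in this report.

```r
# Toy data (hypothetical values); 65535 is the suspected fill value
wdpa_accessibility <- data.frame(
  wdpa_pid = c(1, 2, 3, 4),
  travel_time_to_nearby_cities_min = c(45, 1881, 65535, 1998)
)

# Largest unsigned 16-bit integer; assumed here to flag invalid travel times
invalid_value <- 2^16 - 1

# Inspect the largest travel times first
tail(sort(wdpa_accessibility$travel_time_to_nearby_cities_min))

# Keep only observations with a plausible travel time
wdpa_accessibility_clean <- subset(
  wdpa_accessibility,
  travel_time_to_nearby_cities_min != invalid_value
)
```

Note that subset() also drops rows where the condition is NA, so missing travel times would be removed as well.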


Here, we set alternative cutoff vectors for travel time:

# Upper bound shared by all binnings: largest observed travel time (in minutes)
max_travel_time <-
  max(wdpa_accessibility$travel_time_to_nearby_cities_min, na.rm = TRUE)

# Breaks in minutes; each vector defines one set of accessibility bins
cutoffs_travel_time      <- c(0, 120, 300, max_travel_time)
cutoffs_travel_time_alt2 <- c(0, 120, 240, 300, max_travel_time)
cutoffs_travel_time_alt3 <- c(0, 60, 120, 300, max_travel_time)
cutoffs_travel_time_alt4 <- c(0, 30, 60, 300, max_travel_time)
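As an illustration of how such a cutoff vector can be turned into groups, the sketch below applies base R's cut() to a handful of invented travel times. The values and the names travel_time and travel_time_bin are hypothetical, and this is not necessarily how the report assigns its bins.

```r
# Toy travel times in minutes (hypothetical values)
travel_time <- c(15, 90, 120, 200, 250, 300, 900, 1998)

# Breaks in minutes, ending at the largest observed value
cutoffs <- c(0, 120, 240, 300, max(travel_time, na.rm = TRUE))

# By default the intervals are right-closed: (0,120], (120,240], ...
# A travel time of exactly 0 would become NA unless include.lowest = TRUE is set
travel_time_bin <- cut(travel_time, breaks = cutoffs)

# Number of observations per bin
table(travel_time_bin)
```

Because the intervals are right-closed, an observation of exactly 120 minutes falls into the first bin, not the second.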

Grouped by alternative 1:

Grouped by alternative 2:

Grouped by accessibility, country

Grouped by accessibility, COL

Grouped by size

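# Manual size breaks in km² (these breaks should be revised)
cutoffs_size <-
  c(0, 150, 3000, max(wdpa_info$AREA_KM2, na.rm = TRUE))
# Quartile-based size breaks, computed over the rows of the joined data
cutoffs_size_2 <-
  quantile(inner_join(gfw_kfw_data, wdpa_info, by = "wdpa_pid")$AREA_KM2)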

      (0.0758,115]          (115,682]     (682,2.74e+03] (2.74e+03,5.1e+04] 
              2247               2247               2247               2247 
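The first interval in the output above is open on the left. With cut()'s defaults, observations equal to the smallest break (here the minimum area) are therefore not assigned to any bin. The sketch below shows this on invented areas and how include.lowest = TRUE avoids it; the values and the names area_km2, breaks_q, bin_default and bin_closed are hypothetical.

```r
# Toy PA sizes in km² (hypothetical values)
area_km2 <- c(0.5, 3, 12, 40, 115, 300, 682, 1500, 2740, 51000)

# Quartile breaks: minimum, 25%, 50%, 75%, maximum
breaks_q <- quantile(area_km2, na.rm = TRUE)

# Default: the lowest break is excluded, so the smallest PA becomes NA
bin_default <- cut(area_km2, breaks = breaks_q)

# include.lowest = TRUE closes the first interval on the left
bin_closed <- cut(area_km2, breaks = breaks_q, include.lowest = TRUE)

sum(is.na(bin_default))  # observations lost with the default
table(bin_closed)
```

Whether the report's bins are affected depends on how cut() is called there, which the displayed code does not show.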

Grouped by disbursements

Warning: Removed 4 rows containing non-finite values (stat_bin).

cutoffs_disb <-
  c(0, 100, 1000, 10000, max(finance_data$disb_yr_km2, na.rm = TRUE)) # these breaks should be revised

Relative forest cover: Fitted lines

In this section, we divide the observations into a pre- and post-treatment group:

  • Pre-treatment group: year of observation before or at project start
  • Post-treatment group: year of observation after project start (starting at year one)
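The grouping rule above can be sketched as follows, assuming a relative year defined as the observation year minus the project start year. The toy data frame and the names obs, project_start, rel_year and post_treatment are hypothetical.

```r
# Toy observations for one project (hypothetical start year)
obs <- data.frame(
  year = 2008:2014,
  project_start = 2011
)

# Relative year: 0 in the start year, negative before, positive after
obs$rel_year <- obs$year - obs$project_start

# Post-treatment dummy: 1 from relative year one onwards, 0 otherwise
obs$post_treatment <- as.integer(obs$rel_year >= 1)

obs
```

The start year itself (relative year 0) is assigned to the pre-treatment group, in line with the definition above.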

Slope coefficients


(Slope) coefficients with break at relative year == 1
========================================================
                                 Dependent variable:    
                             ---------------------------
                                 area_pct_projstart     
--------------------------------------------------------
Dummy before vs. after start          -0.004***         
                                       (0.001)          
                                                        
Relative year                         -0.001***         
                                      (0.0001)          
                                                        
Relative year X Dummy                  0.0001           
                                      (0.0001)          
                                                        
Constant                              1.000***          
                                       (0.001)          
                                                        
--------------------------------------------------------
Observations                            9,009           
R2                                      0.188           
Adjusted R2                             0.187           
Residual Std. Error               0.023 (df = 9005)     
F Statistic                   693.886*** (df = 3; 9005) 
========================================================
Note:                        *p<0.1; **p<0.05; ***p<0.01

The regression output above reports the intercepts and slopes of the fitted lines shown in the following plots. Our coefficient of interest is the interaction term Relative year X Dummy, which measures how the yearly trend changes after project start. A positive value indicates a reduction of forest cover losses, i.e. a slower decline in forest cover, after the start of the project.

In our case, the interaction coefficient is positive (0.0001) but small and statistically insignificant.
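A regression of this form can be sketched on simulated data as shown below. This is not the report's own code: the data-generating process is invented, and the column names rel_year and post are hypothetical (only area_pct_projstart appears in the table above).

```r
set.seed(20210305)  # seed value borrowed from the report; any seed works

# Simulated panel: 50 hypothetical PAs observed from relative year -10 to 10
rel_year <- rep(-10:10, times = 50)
post     <- as.integer(rel_year >= 1)

# Invented process: forest cover declines, but more slowly after project start
area_pct_projstart <- 1 - 0.004 * rel_year + 0.002 * rel_year * post +
  rnorm(length(rel_year), sd = 0.005)

sim <- data.frame(area_pct_projstart, rel_year, post)

# Dummy, relative year, and their interaction
fit <- lm(area_pct_projstart ~ post * rel_year, data = sim)
b   <- coef(fit)

# Pre-treatment slope, and post-treatment slope (pre slope plus interaction)
slope_pre  <- b[["rel_year"]]
slope_post <- b[["rel_year"]] + b[["post:rel_year"]]

summary(fit)
```

In this setup the coefficient on post:rel_year is the difference between the post- and pre-treatment slopes, which corresponds to the Relative year X Dummy row in the table.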

Fitted lines and scattered points


Fitted lines and CIs only

  • D: Difference in intercepts of the two regression lines
  • Note: all values (including the intercepts) are estimated! Thus, the estimated intercept does not necessarily equal one, even though we know that all values at normalized year zero are equal to one. The intercept value depends on the value of the coefficient of relative year.
  • In fact, the intercept for pre-treatment is not estimated to be zero either (see parameters displayed in the plots)
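The point about estimated intercepts can be illustrated with a small deterministic example: a series that is exactly one at relative year zero but slightly curved, so a straight line fitted to it does not pass through one. The numbers and the names rel_year, forest_cover and fit are invented for illustration.

```r
# Pre-treatment years only; the value at relative year 0 is exactly 1
rel_year     <- -5:0
forest_cover <- 1 - 0.01 * rel_year - 0.001 * rel_year^2

# A straight-line fit has to compromise across all years
fit <- lm(forest_cover ~ rel_year)

forest_cover[rel_year == 0]  # observed value at year zero
coef(fit)[["(Intercept)"]]   # estimated value at year zero
```

The fitted intercept lands slightly above one because the line trades off errors over all six years instead of matching the year-zero value exactly.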


Grouped by accessibility bins

The following plot divides our observations (PAs) into three ‘accessibility groups’, i.e. groups with similar travel time to the next city. This way, we can observe the relationship between forest cover loss dynamics (in general, and pre- and post-treatment trends more specifically) and accessibility of the PA. A change in the slope coefficient (within colored groups) allows statements about reductions or increases in forest cover loss dynamics.


Description:

  • Flatter line (smaller absolute slope) in the post-treatment group, compared to pre-treatment
  • Strongest dynamic for protected areas with 2–6h travel time to the next city
  • Largest ‘reduction’ of forest cover losses after project start for protected areas with 2–6h travel time to the next city
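Comparisons like the ones listed above can be made numerically by fitting the pre/post regression separately within each accessibility bin. The sketch below does this on simulated data; the bin labels, slopes, and column names (bin, rel_year, post) are invented, and the report's plots may be produced differently.

```r
set.seed(1)

# Invented yearly trends per bin, before and after project start
bins       <- c("(0,120]", "(120,300]", "(300,max]")
pre_slope  <- c(-0.001, -0.004, -0.002)
post_slope <- c(-0.001, -0.002, -0.002)

# Simulate 30 hypothetical PAs per bin over relative years -10 to 10
sim <- do.call(rbind, lapply(seq_along(bins), function(i) {
  rel_year <- rep(-10:10, times = 30)
  post     <- as.integer(rel_year >= 1)
  slope    <- ifelse(post == 1, post_slope[i], pre_slope[i])
  data.frame(
    bin = bins[i], rel_year = rel_year, post = post,
    area_pct_projstart = 1 + slope * rel_year +
      rnorm(length(rel_year), sd = 0.005),
    stringsAsFactors = FALSE
  )
}))

# One regression per bin; collect pre slope, post slope, and their difference
slopes <- do.call(rbind, lapply(split(sim, sim$bin), function(d) {
  b <- coef(lm(area_pct_projstart ~ post * rel_year, data = d))
  data.frame(
    bin        = d$bin[1],
    slope_pre  = b[["rel_year"]],
    slope_post = b[["rel_year"]] + b[["post:rel_year"]],
    change     = b[["post:rel_year"]],
    stringsAsFactors = FALSE
  )
}))

slopes
```

A positive value in the change column means the fitted line is flatter after project start, i.e. losses have slowed in that bin.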

Interpretation/ Discussion:

  • Are protected areas located 2–6h away from the next city the most threatened PAs?
  • PAs with higher accessibility (i.e. shorter travel time) may serve as ‘local recreation areas’, whereas forest areas with medium accessibility are at the current deforestation front.
  • Do we observe fewer forest cover losses in PAs close to cities because forests near settlements have already suffered significant losses in the past (‘not much left to deforest’)?
  • Note: All these plots include only protected areas, so they allow no statements about forest areas in general.

Follow-up:

  • Should accessibility to the next city be normalized? Rationale: Substantial differences in country sizes

Alternative accessibility bins

By using alternative ‘accessibility bins’, one can see heterogeneous forest cover loss dynamics within the 2–6h travel time group. A change in the slope coefficient (within colored groups) allows statements about reductions or increases in forest cover loss dynamics.


Description:

  • The 2–6h travel time group exhibits strongly heterogeneous forest cover loss dynamics.
  • Substantially higher forest cover losses in the 4–6h travel time group.
  • The reduction of losses in the 2–6h group (see plot above) is largely driven by reduced losses in the 2–4h group.
  • Losses in the 4–6h group have, on average, become larger since project start.

What about sample size in the subgroups?

The following table displays the number of observations for each travel time group (number of observations = number of data points, not number of ‘projects’). As indicated by the wide confidence bands for the 4–6h group, the sample size in this group is rather small compared to the other groups. Note, however, that the figures shown do not reflect the number of observations in each normalized year.


    (0,120]   (120,240]   (240,300] (300,2e+03] 
       4893        1260         231        1470 
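To see how many observations each group contributes in each normalized year, the bins can be cross-tabulated against the relative year. A minimal sketch on invented data follows; the names obs, travel_time, rel_year, bin and counts are hypothetical, and the upper break of 2000 minutes is a placeholder for the observed maximum.

```r
# Toy observations (hypothetical): travel time in minutes and relative year
obs <- data.frame(
  travel_time = c(30, 30, 30, 150, 150, 260, 260, 260, 500),
  rel_year    = c(-1, 0, 1, 0, 1, -1, 0, 1, 1)
)

# Same break structure as in the table above, with a placeholder maximum
obs$bin <- cut(obs$travel_time, breaks = c(0, 120, 240, 300, 2000))

# Rows: travel time bins; columns: relative years
counts <- table(obs$bin, obs$rel_year)
counts
```

Sparse cells in such a table indicate years in which a group's fitted line rests on very few data points.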

To do:

  • What are sensible cutoff points for the accessibility variable, considering differences between <5k and <20k data set?

Alternative accessibility bins #2


     (0,60]    (60,120]   (120,300] (300,2e+03] 
       3801        1092        1491        1470 

Grouped by accessibility bins, selected country


Relative forest cover: Boxplots


R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] ggpmisc_0.4.4   ggpp_0.4.2      stargazer_5.2.2 forcats_0.5.1  
 [5] stringr_1.4.0   dplyr_1.0.7     purrr_0.3.4     readr_1.4.0    
 [9] tidyr_1.1.4     tibble_3.1.6    tidyverse_1.3.1 plotly_4.9.3   
[13] ggplot2_3.3.4   sf_1.0-4       

loaded via a namespace (and not attached):
 [1] nlme_3.1-152       matrixStats_0.57.0 fs_1.5.0           lubridate_1.7.10  
 [5] httr_1.4.2         rprojroot_2.0.2    tools_3.6.3        backports_1.2.1   
 [9] bslib_0.2.5.1      utf8_1.2.2         R6_2.5.1           KernSmooth_2.23-20
[13] mgcv_1.8-36        DBI_1.1.1          lazyeval_0.2.2     colorspace_2.0-1  
[17] withr_2.4.2        tidyselect_1.1.1   compiler_3.6.3     git2r_0.28.0      
[21] cli_3.1.0          rvest_1.0.0        quantreg_5.86      SparseM_1.78      
[25] xml2_1.3.2         labeling_0.4.2     sass_0.4.0         scales_1.1.1      
[29] classInt_0.4-3     proxy_0.4-26       digest_0.6.27      rmarkdown_2.11    
[33] pkgconfig_2.0.3    htmltools_0.5.1.1  highr_0.8          dbplyr_2.1.1      
[37] htmlwidgets_1.5.3  rlang_0.4.12       readxl_1.3.1       rstudioapi_0.13   
[41] farver_2.1.0       jquerylib_0.1.4    generics_0.1.1     jsonlite_1.7.2    
[45] crosstalk_1.1.1    magrittr_2.0.1     polynom_1.4-0      Matrix_1.3-4      
[49] Rcpp_1.0.7         munsell_0.5.0      fansi_0.5.0        lifecycle_1.0.1   
[53] stringi_1.7.6      whisker_0.4        yaml_2.2.1         promises_1.2.0.1  
[57] crayon_1.4.2       lattice_0.20-44    splines_3.6.3      haven_2.3.1       
[61] hms_1.1.1          knitr_1.34         pillar_1.6.4       reprex_2.0.0      
[65] glue_1.5.1         evaluate_0.14      data.table_1.13.6  modelr_0.1.8      
[69] vctrs_0.3.8        httpuv_1.6.1       MatrixModels_0.4-1 cellranger_1.1.0  
[73] gtable_0.3.0       assertthat_0.2.1   xfun_0.24          broom_0.7.6       
[77] e1071_1.7-9        later_1.2.0        class_7.3-19       viridisLite_0.4.0 
[81] conquer_1.0.2      workflowr_1.6.2    units_0.7-2        ellipsis_0.3.2